Overview

Action vs. Comedy: Comparing Different Styles of Film Music

A film’s soundtrack has a profound influence on how we perceive what we see on screen, and it is difficult to imagine movies without any background music. With the exception of musicals, our attention is rarely directed explicitly to the music, and yet our experience changes significantly because of the music.

In blockbuster movies, music is usually created or chosen for every single scene in order to underline what is happening in that scene specifically. But can we find general patterns in the types of music chosen for certain types of movies, that is, certain genres?

More specifically, are there systematic differences between music that is composed or selected for action movies and music that is composed for comedy movies? These two types of films usually entertain their audiences in quite different ways. While action movies use suspense, thrill, and impressive visual effects to excite their audience, comedy movies often provide a way for their viewers to relax, wind down, and get a feeling of happiness and pleasure. Since music is used as a means to achieve these quite different goals, it would make sense that the music composed and chosen for each genre would be quite different as well.

To obtain the corpus, the most commercially successful action and comedy movies of 2018 were considered. The box office charts from Box Office Mojo (https://www.boxofficemojo.com/yearly/) were used to determine which movies to include, along with IMDB’s information about genre (https://www.imdb.com/). For each film, IMDB lists up to three genres. A movie’s soundtrack was included if and only if the full soundtrack album was available on Spotify and the film’s genres according to IMDB included either action or comedy, but not both. The selection started with the most commercially successful movie (Black Panther) and moved down the charts until 25 full action movie soundtracks and 25 full comedy movie soundtracks were obtained.
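The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Film` record, its field names, and the placeholder chart in the usage note are hypothetical and are not taken from Box Office Mojo, IMDB, or Spotify.

```python
from dataclasses import dataclass


@dataclass
class Film:
    title: str
    genres: tuple      # up to three genre labels (hypothetical field)
    on_spotify: bool   # True if the full soundtrack album is available (hypothetical field)


def genre_label(film):
    """Return 'action' or 'comedy' if exactly one of the two applies, else None."""
    is_action = "Action" in film.genres
    is_comedy = "Comedy" in film.genres
    if is_action == is_comedy:  # both genres or neither: excluded
        return None
    return "action" if is_action else "comedy"


def build_corpus(chart, per_genre=25):
    """Walk the chart from the top until each genre has `per_genre` soundtracks."""
    corpus = {"action": [], "comedy": []}
    for film in chart:
        label = genre_label(film)
        if label is None or not film.on_spotify:
            continue
        if len(corpus[label]) < per_genre:
            corpus[label].append(film.title)
        if all(len(titles) == per_genre for titles in corpus.values()):
            break
    return corpus
```

With a chart ordered by box office revenue, `build_corpus(chart)` would return the titles of up to 25 action and 25 comedy films. A film tagged with both genres, or one without a full soundtrack album, is skipped.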

Click through the drop-down menus at the top of the page to explore the audio features of film music according to the Spotify API, how they differ between action and comedy film music, and what else stands out.

Angry Action, Happy Comedy

[Figure: Valence-energy map]

The graph on the left allows conclusions to be drawn about the features of film music in general, as well as about how action movie music and comedy movie music differ from each other.

In both genres, there is a large cluster of tracks with a very low valence, approaching zero. While this is normally interpreted as sad (in the case of low energy) or angry (in the case of high energy) music, the low valence may also be a result of film music’s character of being in the background and merely accompanying the visual scenes.

While valence seems to be low in general among film music, this is less true for comedy music. Compared to action film music, comedy music has noticeably more tracks falling in the top-right quadrant of the valence-energy map, which is considered the quadrant describing ‘happy’ music. This makes intuitive sense - we expect music in comedy films to be uplifting, funny, happy. On the other hand, a lot more action movie tracks than comedy movie tracks seem to fall in the top-left quadrant, usually describing ‘angry’ music. Again, this is in line with our intuitions about music in action movies: fast-paced, suspenseful, intense, but not usually very happy-sounding.

The graph also seems to suggest that comedy soundtracks contain more songs in a major key than action soundtracks, a pattern that will be explored on the next page.
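To make the quadrant comparison concrete, the following sketch assigns each track to a quadrant and reports the share of tracks per quadrant and genre. It assumes that valence and energy lie between 0 and 1 and that the quadrants are split at 0.5; that midpoint is an assumption, as are the function names.

```python
from collections import Counter, defaultdict


def quadrant(valence, energy, midpoint=0.5):
    """Label a track by its quadrant on the valence-energy map."""
    if energy >= midpoint:
        return "happy" if valence >= midpoint else "angry"
    return "relaxed" if valence >= midpoint else "sad"


def quadrant_shares(tracks, midpoint=0.5):
    """tracks: iterable of (genre, valence, energy) tuples.

    Returns {genre: {quadrant: fraction of that genre's tracks}}.
    """
    counts = defaultdict(Counter)
    for genre, valence, energy in tracks:
        counts[genre][quadrant(valence, energy, midpoint)] += 1
    return {
        genre: {q: n / sum(c.values()) for q, n in c.items()}
        for genre, c in counts.items()
    }
```

Comparing, for example, the "angry" share for action tracks with the "happy" share for comedy tracks would put a number on the visual impression from the graph.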

Functions of chroma

[Figure tabs: Top of the Tower; Overnight Sensation]

The valence-energy graph under the Audio features tab shows that compared to comedy movies, action movies have more tracks that fall in the upper-left quadrant, low valence but high energy, often characterised as “angry” songs. On the other hand, comedy movies seem to have more tracks falling into the upper-right quadrant, high valence and high energy, often characterised as “happy” songs.

On the left, in the first tab, we see a spectrogram and cepstrogram for a typical “angry” song from our sample drawn from action movies, Top of the Tower, from The Equalizer 2. In the second tab, we see a spectrogram and cepstrogram for a typical “happy” song from our sample drawn from comedy movies, Overnight Sensation from Ralph Breaks the Internet.

A look at the chromagrams reveals that for the action song Top of the Tower, most of the power is concentrated around C, which is a very dominant note in the song. Moreover, the song achieves its feeling of suspense by creating dissonance through the use of minor seconds, indicated by the relatively high power around C# and B. In contrast, the comedy song Overnight Sensation has its power much more evenly distributed across all tones of the scale, which can be heard in the song’s use of many diverse intervals, chords, and key changes.
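The difference between “concentrated around C” and “evenly distributed” can be quantified with a simple measure. The sketch below sums chroma power per pitch class and computes a normalised entropy: 0 means all power sits on one pitch class, 1 means it is spread evenly over all twelve. The input format (a list of 12-element chroma frames) and the choice of entropy as the measure are assumptions made for illustration.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chroma_profile(frames):
    """Sum chroma power over all frames and normalise so the profile sums to 1."""
    totals = [0.0] * 12
    for frame in frames:
        for i, value in enumerate(frame):
            totals[i] += value
    total = sum(totals)
    if total == 0:
        raise ValueError("chroma frames contain no power")
    return [t / total for t in totals]


def evenness(profile):
    """Normalised Shannon entropy of a pitch-class profile, between 0 and 1."""
    entropy = -sum(p * math.log(p) for p in profile if p > 0)
    return entropy / math.log(12)


def dominant_pitch_class(profile):
    """Name of the pitch class carrying the most power."""
    return PITCH_CLASSES[profile.index(max(profile))]
```

On this measure, a track like Top of the Tower would be expected to score lower than Overnight Sensation, though the actual values depend on the chroma data used.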

Timbre

[Figure tabs: Top of the Tower; Overnight Sensation]

When comparing the same songs, Top of the Tower and Overnight Sensation, Top of the Tower appears to have a higher level of brightness. However, it is also visible that after around 100 seconds, there is a sudden drop in brightness. This corresponds to a monotonous, mellow part of the song that follows an energetic climax.
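A sudden drop like the one described above can also be located programmatically. The sketch below takes a brightness series (for instance, one value per segment) and finds the point where the average of the following values falls furthest below the average of the preceding ones. The window size and the function name are assumptions, and the numbers in the usage note are made up.

```python
def largest_drop(times, values, window=3):
    """Find the time at which the series drops most sharply.

    Compares the mean of the `window` values before each position with the
    mean of the `window` values from that position onwards.
    Returns (time, drop); time is None if the series never drops.
    """
    best_time, best_drop = None, 0.0
    for i in range(window, len(values) - window + 1):
        before = sum(values[i - window:i]) / window
        after = sum(values[i:i + window]) / window
        drop = before - after
        if drop > best_drop:
            best_time, best_drop = times[i], drop
    return best_time, best_drop
```

For a made-up series sampled every 20 seconds, `largest_drop(list(range(0, 200, 20)), [50, 52, 51, 49, 50, 10, 12, 11, 9, 10])` locates the drop at the 100-second mark.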

Dynamic time warping

[Figure tabs: Mamma Mia; In This Place; Everything I Need]

Sometimes existing songs are re-recorded or edited to create an alternative version for a movie. Comparing the movie version to the original version may tell us a bit about the attributes of film music in general.

In the first tab on the left, you see Dynamic Time Warping applied to two different versions of the song Mamma Mia by ABBA. Specifically, it compares the original version from 1975 to the one performed by the cast of Mamma Mia: Here We Go Again (2018). Interestingly, some sections of the song seem to be very similar between the two versions, whereas others are not. For the first 45 seconds of the song, we do not see a distinct diagonal line going through the graph, indicating that the versions are vastly different. The film version has a very slow, variable-tempo, acoustic intro, whereas the original song starts as upbeat as the rest of the song. Only from 00:45 does the film version sound more like the original version, at which point we also see a diagonal line going through the graph. Looking at the graph as a whole reveals the sections of the song where the film and original versions are similar and where they differ.

The soundtrack of Ralph Breaks the Internet features two versions of the song In This Place, one instrumental and one featuring Julia Michaels’ voice. Compared to the Mamma Mia graph, we see a much more distinct dark diagonal line going through the whole song. This indicates that the original version and instrumental version are very similar, at least in terms of chroma. However, we also see that about 10 seconds into the song, precisely where Julia Michaels starts singing in the original version, the line becomes a little less clear, indicating that the singer’s voice does have a visible effect on the song’s chroma features.

Finally, Aquaman’s soundtrack features the original version of the song Everything I Need by Skylar Grey, as well as a modified version made for the film. The graph on the left reveals that these two versions are relatively similar in terms of chroma, but do contain quite a few differences. The fact that the diagonal line does not go through the origin (0, 0) of the coordinate system, but rather intersects the x-axis at around 15 seconds, indicates that the original version contains a short intro not contained in the film version.
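For readers curious how the alignment behind these graphs works, here is a minimal sketch of Dynamic Time Warping over two sequences of chroma vectors. It is a textbook formulation in plain Python, using cosine distance as an assumed distance measure. It is not necessarily the routine or the settings used to produce the graphs on the left.

```python
import math


def cosine_distance(a, b):
    """1 minus the cosine similarity of two vectors (1.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def dtw(seq_a, seq_b, dist=cosine_distance):
    """Return (total alignment cost, warping path as a list of index pairs)."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    # Fill the accumulated-cost table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Trace the cheapest path back from the end to the start.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path
```

When two versions match closely, the path runs along the diagonal. Where one version has extra material, such as an intro, the path runs parallel to one axis before joining a diagonal.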

Self-similarity matrices

[Figure tabs: Top of the Tower · Chroma; Top of the Tower · Timbre; Overnight Sensation · Chroma; Overnight Sensation · Timbre]

The structure of a song can be evaluated by means of a self-similarity matrix. Such matrices have been computed for the archetypical action and comedy songs, Top of the Tower (action) and Overnight Sensation (comedy). Self-similarity matrices can be computed on chroma and on timbre features, both of which were done here, resulting in a total of four matrices.

The action song, Top of the Tower, sounds relatively monotonous in terms of chroma (mostly lingering on pitch C, sometimes alternating with C#), and more diverse in terms of timbre, with some mellow, mostly electronic sounds in the beginning and more of an orchestral sound at the end. Both of these observations are illustrated in the self-similarity matrices. The chroma changes relatively little throughout the song, except for a part at the end. The timbre stays relatively constant in the first half of the song, but then seems to change quite substantially, as the song begins to sound more epic and orchestral. Both of these features are used to create a feeling of suspense, making this song fit well within an action film soundtrack.

In contrast, in the case of the comedy song, Overnight Sensation, change seems to be the only constant. In terms of chroma, the matrix looks so chaotic that the song probably does not have a clear structure. In fact, I think the point of this song is that there is “a lot going on”, and the lack of structure is used as a stylistic device. Running the song through Chordify yields a small number of chords that repeat relatively consistently, but a clear chromatic structure is still not picked up, probably because the chords are quite unrelated to each other (in a music-theoretical sense), so the notes contained in one chord may be completely different from the notes contained in the previous chord. Timbre-wise, there seems to be some structure, with different parts each lasting 20-25 seconds, but it is still relatively chaotic, especially when compared with the action song.
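As a rough illustration of the idea, a self-similarity matrix simply compares every frame of a song with every other frame of the same song. The sketch below uses Euclidean distance, which is an assumed choice, so low values mean “similar”. The same function works for chroma vectors and timbre vectors alike.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def self_similarity(frames, dist=euclidean):
    """Pairwise distance matrix of a feature sequence with itself.

    frames: list of feature vectors (e.g. 12 chroma or 12 timbre values each).
    Entry [i][j] is the distance between frame i and frame j.
    """
    n = len(frames)
    return [[dist(frames[i], frames[j]) for j in range(n)] for i in range(n)]
```

Repeated sections show up as low-distance stripes parallel to the main diagonal, and homogeneous sections as low-distance blocks.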

Building suspense by key

[Figure tabs: It’s Only One Red Sock; Train Heist]

As illustrated in the graph under Audio features, “relaxed songs”, that is, those with high valence and low energy, tend to be written in a major key. However, there are exceptions, and it may be interesting to look at one of them. One of these exceptions, from the comedy movie Paddington 2, is the track It’s Only One Red Sock. Listening to it gives us an idea of why this song is in this unique position. It sounds like a song that is made for a gripping, suspenseful scene, but it simultaneously has a jolly and playful character, reminding us that it is still from a children’s movie.

The key profile in the first tab on the left reveals that this song has essentially a three-part structure: a part dominated by a B minor chord, an interlude dominated by its subdominant, E minor, and a final part that is in B minor again. However, the final section does not quite look like the first one. Indeed, new chords are introduced in between, and notes that are not part of the chord are played in the background, leading to a keygram where there is power in almost every chord label.

Let’s look at a different example, Train Heist from Solo: A Star Wars Story, a track that, with its relatively low valence and medium energy, represents a typical action movie song. This keygram looks comparatively more complex, and the track sounds like it would accompany a suspenseful scene with many twists - marked by many key changes. A quiet intro in C minor is followed by a slightly more energetic, but still melancholic part in F minor. The following part in D minor slowly builds up the suspense and makes the song increase in energy. The key shifts to E minor and back to D minor. At the song’s climax there is power corresponding to many different chords, in line with the sudden harmonic changes and the use of dissonance in the music. After using predominantly minor keys, the song ends with a finale in E major, sounding significantly happier and more triumphant than the rest of the song, presumably indicating that the heist has been a success.
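The idea behind such a graph can be sketched with simple template matching: each chroma frame is compared against a template for every major and minor triad, and the best-matching label is reported. The binary triad templates and cosine similarity used here are assumptions chosen for simplicity. The graphs on the left may be based on different templates or distance measures.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def triad_template(root, minor):
    """Binary 12-element template with root, third, and fifth set to 1."""
    third = 3 if minor else 4
    template = [0.0] * 12
    for interval in (0, third, 7):
        template[(root + interval) % 12] = 1.0
    return template


# One template per major and minor triad, 24 in total.
TEMPLATES = {
    f"{PITCH_CLASSES[root]} {'minor' if minor else 'major'}": triad_template(root, minor)
    for root in range(12)
    for minor in (False, True)
}


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def chord_scores(chroma):
    """Similarity of one chroma frame to every triad template."""
    return {name: cosine_similarity(chroma, t) for name, t in TEMPLATES.items()}


def best_chord(chroma):
    """Label of the best-matching triad for one chroma frame."""
    scores = chord_scores(chroma)
    return max(scores, key=scores.get)
```

A frame containing only B, D, and F# matches B minor best. When many additional notes sound at once, the scores for many labels rise together, which corresponds to the “power in almost every chord label” described above.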

Classification

[Figure tabs: Most important features; Classification using all predictors; Classification using most important predictors]

Using a classifier on the corpus can give us some indication of how different the two types of music are and how well R classifies tracks as belonging to action or comedy soundtracks.

Random Forests (first tab) allow us to identify the most important features, that is, the features that most reliably distinguish action from comedy film music. For our sample, these appear to be timbre components 6, 4, 5, 9, and 7, as well as the features valence, instrumentalness, duration, and danceability.

The second tab shows the classifier’s performance if all features are taken into account. As the mosaic plot indicates, accuracy is fairly high (0.728). Below is the confusion matrix of the model with all features.

          Truth
Prediction Action Comedy
    Action    290    110
    Comedy     81    228

The third tab shows the performance if only the five most important timbre components and the other four most important audio features are included in the model. Accuracy of the classifier increases, but only by a little (0.738). Below is the confusion matrix with only the selected features.

          Truth
Prediction Action Comedy
    Action    277     95
    Comedy     94    243
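Overall accuracy and per-genre recall can be read directly off the two confusion matrices. The sketch below pools the counts exactly as printed in the tables. Pooled values can differ slightly from an accuracy that was averaged over cross-validation folds, so they need not match the reported figures to the third decimal. The function and variable names are assumptions.

```python
def accuracy(matrix):
    """Share of tracks on the diagonal. matrix[prediction][truth] holds counts."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total


def recall(matrix, label):
    """Share of tracks that truly belong to `label` and were predicted as such."""
    truly_label = sum(row[label] for row in matrix.values())
    return matrix[label][label] / truly_label


# Counts copied from the two tables above: outer key is prediction, inner key is truth.
all_features = {
    "Action": {"Action": 290, "Comedy": 110},
    "Comedy": {"Action": 81, "Comedy": 228},
}
selected_features = {
    "Action": {"Action": 277, "Comedy": 95},
    "Comedy": {"Action": 94, "Comedy": 243},
}

for name, matrix in [("all", all_features), ("selected", selected_features)]:
    print(name, round(accuracy(matrix), 3),
          round(recall(matrix, "Action"), 3), round(recall(matrix, "Comedy"), 3))
```

Looking at recall per genre alongside accuracy shows whether one type of soundtrack is recognised more reliably than the other.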

Conclusions

To be added at a later stage.